SwePub
Tyck till om SwePub Sök här!
Sök i LIBRIS databas

  Extended search

db:Swepub
 

Search: db:Swepub > Chen Xiaowen > Efficient Memory Ac...

Efficient Memory Access and Synchronization in NoC-based Many-core Processors

Chen, Xiaowen, 1982- (author)
KTH,Elektronik och inbyggda system
Lu, Zhonghai, Professor (thesis advisor)
KTH,Elektroniksystemkonstruktion,Elektronik och inbyggda system
Ogras, Umit, Assistant Professor (opponent)
School of ECEE, Arizona State University
 (creator_code:org_t)
ISBN 9789178730513
Stockholm, Sweden : KTH Royal Institute of Technology, 2019
English 164 s.
Series: TRITA-EECS-AVL ; 2019:2
  • Doctoral thesis (other academic/artistic)
Abstract Subject headings
Close  
  • In NoC-based many-core processors, memory subsystem and synchronization mechanism are always the two important design aspects, since mining parallelism and pursuing higher performance require not only optimized memory management but also efficient synchronization mechanism. Therefore, we are motivated to research on efficient memory access and synchronization in three topics, namely, efficient on-chip memory organization, fair shared memory access, and efficient many-core synchronization.One major way of optimizing the memory performance is constructing a suitable and efficient memory organization. A distributed memory organization is more suitable to NoC-based many-core processors, since it features good scalability. We envision that it is essential to support Distributed Shared Memory (DSM) because of the huge amount of legacy code and easy programming. Therefore, we first adopt the microcoded approach to address DSM issues, aiming for hardware performance but maintaining the flexibility of programs. Second, we further optimize the DSM performance by reducing the virtual-to-physical address translation overhead. In addition to the general-purpose memory organization such as DSM, there exists special-purpose memory organization to optimize the performance of application-specific memory access. We choose Fast Fourier Transform (FFT) as the target application, and propose a multi-bank data memory specialized for FFT computation.In 3D NoC-based many-core processors, because processor cores and memories reside in different locations (center, corner, edge, etc.) of different layers, memory accesses behave differently due to their different communication distances. As the network size increases, the communication distance difference of memory accesses becomes larger, resulting in unfair memory access performance among different processor cores. This unfair memory access phenomenon may lead to high latencies of some memory accesses, thus negatively affecting the overall system performance. Therefore, we are motivated to study on-chip memory and DRAM access fairness in 3D NoC-based many-core processors through narrowing the round-trip latency difference of memory accesses as well as reducing the maximum memory access latency.Barrier synchronization is used to synchronize the execution of parallel processor cores. Conventional barrier synchronization approaches such as master-slave, all-to-all, tree-based, and butterfly are algorithm oriented. As many processor cores are networked on a single chip, contended synchronization requests may cause large performance penalty. Motivated by this, different from the algorithm-based approaches, we choose another direction (i.e., exploiting efficient communication) to address the barrier synchronization problem. We propose cooperative communication as a means and combine it with the master-slave algorithm and the all-to-all algorithm to achieve efficient many-core barrier synchronization. Besides, a multi-FPGA implementation case study of fast many-core barrier synchronization is conducted.

Subject headings

TEKNIK OCH TEKNOLOGIER  -- Elektroteknik och elektronik -- Inbäddad systemteknik (hsv//swe)
ENGINEERING AND TECHNOLOGY  -- Electrical Engineering, Electronic Engineering, Information Engineering -- Embedded Systems (hsv//eng)
TEKNIK OCH TEKNOLOGIER  -- Elektroteknik och elektronik -- Datorsystem (hsv//swe)
ENGINEERING AND TECHNOLOGY  -- Electrical Engineering, Electronic Engineering, Information Engineering -- Computer Systems (hsv//eng)

Keyword

Many-core
Network-on-Chip
Distributed Shared Memory
Microcode
Virtual-to-physical Address Translation
Memory Access Fairness
Barrier Synchronization
Cooperative Communication
Informations- och kommunikationsteknik
Information and Communication Technology

Publication and Content Type

vet (subject category)
dok (subject category)

Find in a library

To the university's database

Find more in SwePub

By the author/editor
Chen, Xiaowen, 1 ...
Lu, Zhonghai, Pr ...
Ogras, Umit, Ass ...
About the subject
ENGINEERING AND TECHNOLOGY
ENGINEERING AND ...
and Electrical Engin ...
and Embedded Systems
ENGINEERING AND TECHNOLOGY
ENGINEERING AND ...
and Electrical Engin ...
and Computer Systems
Parts in the series
By the university
Royal Institute of Technology

Search outside SwePub

Kungliga biblioteket hanterar dina personuppgifter i enlighet med EU:s dataskyddsförordning (2018), GDPR. Läs mer om hur det funkar här.
Så här hanterar KB dina uppgifter vid användning av denna tjänst.

 
pil uppåt Close

Copy and save the link in order to return to this view